Effects of Centralized and Distributed Version Control on Commit Granularity

Authors

  • Jochen Wuttke
  • Ivan Beschastnikh
  • Yuriy Brun
Abstract

Version control systems are critical for coordinating work in large software engineering teams. Recently, distributed version control (DVC) systems have become popular, as they have many advantages over their centralized (CVC) counterparts. DVC allows for more frequent commits and simplifies branching and merging. These features encourage developers to make smaller, finer-grained commits that do not interleave changes related to different development tasks. Such commits improve accountability and ease certain tasks, such as reverting changes that later cause problems. DVC systems are also better suited for repository mining techniques, making available more useful information about the development process [2]. For example, approaches that infer collaboration patterns can benefit from the more detailed attribution of data in DVC. An integration server can use this attribution to send email about failed test cases to just the subset of developers who authored the relevant code. DVC may also lead to smaller and more focused commits, which could benefit mining techniques that identify changes relevant to specific development tasks, such as refactorings [3]. However, to date, there has been no explicit evaluation of the practical differences in mining DVC over CVC, though some work acknowledges that developers might use DVC and CVC differently [1]. We report on such an evaluation with one counterintuitive finding that raises doubts about certain DVC promises and opens research questions into what causes DVC and CVC differences. Further, our finding indicates that repository type should be controlled for in repository mining experiments.

Body

In a study of six CVC and six DVC repositories, we found that the median size of code commits in DVC is 38% larger than in CVC.
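The reported result is a relative difference between two medians. A minimal sketch of that arithmetic follows, assuming commit size is measured as changed lines per commit (the abstract does not state the metric). The function name `median_size_increase` and the sample numbers are hypothetical illustrations, not data or code from the study.

```python
from statistics import median


def median_size_increase(cvc_sizes, dvc_sizes):
    """Return how much larger the DVC median is than the CVC median, as a fraction."""
    cvc_median = median(cvc_sizes)
    dvc_median = median(dvc_sizes)
    return (dvc_median - cvc_median) / cvc_median


# Made-up commit sizes (changed lines per commit); NOT data from the study.
cvc_sizes = [4, 10, 20, 35, 120]
dvc_sizes = [6, 15, 25, 60, 300]

increase = median_size_increase(cvc_sizes, dvc_sizes)
print(f"DVC median commit size is {increase:.0%} larger than CVC")
```

The median is used rather than the mean because a few very large commits (for example, bulk imports) would otherwise dominate the comparison.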

Similar resources

Distributed multi-agent Load Frequency Control for a Large-scale Power System Optimized by Grey Wolf Optimizer

This paper aims to design an optimal distributed multi-agent controller for load frequency control and optimal power flow purposes. The controller parameters are optimized using the Grey Wolf Optimization (GWO) algorithm. The designed optimal distributed controller is employed for load frequency control in the IEEE 30-bus test system with six generators. The controller of each generator is consider...

Simulations of the implementation of primary copy two-phase locking in distributed database systems

This paper considers algorithms for concurrency control in distributed database (DDB) systems and presents simulation models of the implementation of two-phase locking (2PL) in DDB. Of the four types of 2PL in DDB (Centralized 2PL, Primary copy 2PL, Distributed 2PL and voting 2PL), Primary copy 2PL is examined, as this protocol is a "transitional" protocol from Centralized 2PL to Distributed 2PL. T...

Pastwatch: A Distributed Version Control System

Pastwatch is a version control system that acts like a traditional client-server system when users are connected to the network; users can see each other’s changes immediately after the changes are committed. When a user is not connected, Pastwatch also allows users to read revisions from the repository, commit new revisions and share modifications directly between users, all without access to ...

Simulation Studies Of The Implementation Of Centralized Two-Phase Locking In DDBMS

One of the most important problems in distributed database systems is concurrency control. This paper considers algorithms simulating the implementation of centralized two-phase locking (2PL) in distributed database systems and presents simulation results. It describes specifically the simulations of two-version 2PL and 2PL with an integrated timestamp ordering mechanism. In concurrency control method ...

Distributed Optimistic Concurrency Control for High Performance Transaction Processing

The performance of high-volume transaction processing systems is determined by the degree of hardware and data contention. This is especially a problem in the case of distributed systems with global transactions accessing and updating objects from multiple systems. While the conventional two-phase locking method of centralized systems can be adapted for concurrency control in distributed syste...

Journal title:
  • TinyToCS

Volume 1

Publication date: 2012